The Spoken BNC2014
Abstract
This paper introduces the Spoken British National Corpus 2014, an 11.5-million-word corpus of orthographically transcribed conversations among L1 speakers of British English from across the UK, recorded in the years 2012–2016. After showing that a survey of the recent history of corpora of spoken British English justifies the compilation of this new corpus, we describe the main stages of the Spoken BNC2014’s creation: design, data and metadata collection, transcription, XML encoding, and annotation. In doing so we aim to (i) encourage users of the corpus to approach the data with sensitivity to the many methodological issues we identified and attempted to overcome while compiling the Spoken BNC2014, and (ii) inform (future) compilers of spoken corpora of the innovations we implemented to attempt to make the construction of corpora representing spontaneous speech in informal contexts more tractable, both logistically and practically, than in the past.
Similar resources
The Role of Thematic Structure in Comprehending Spoken Language
This study is concerned with the relationship between variation in thematic structure and the comprehension of spoken language. The study focused on the following questions: 1. Is there any relationship between thematic structure and the comprehension of spoken language? 2. Which of the themes would have greater thematic force and be easier for the subjects to comprehend? accord...
On the Use of Diary Study to Investigate Avoidance Strategy in Spoken English Courses
In the present study, an attempt is made to investigate the frequency and motives of using avoidance strategies by a group of Iranian intermediate language learners through their own journal writing. The effect of gender on the use of avoidance strategies is investigated as well. Thirty-nine female and twenty-three male learners enrolled in an English language spoken course in a private E...
Introduction: Compiling and analysing the Spoken British National Corpus 2014
For over twenty years, the British National Corpus has been one of the most widely known and used corpora. It is almost impossible to attend an international corpus linguistics conference such as Corpus Linguistics, ICAME (International Computer Archive of Modern and Medieval English), AACL (American Association for Corpus Linguistics) or APCLC (Asia Pacific Corpus Linguistics Conference) witho...
Navigating the Spoken Wikipedia
The Spoken Wikipedia project unites volunteer readers of encyclopedic entries. Their recordings make encyclopedic knowledge accessible to persons who are unable to read (out of alexia, visual impairment, or because their sight is currently occupied, e. g. while driving). However, on Wikipedia, recordings are available as raw audio files that can only be consumed linearly, without the possibilit...
Journal
Journal title: International Journal of Corpus Linguistics
Year: 2022
ISSN: 1569-9811, 1384-6655
DOI: https://doi.org/10.1075/ijcl.22.3.02lov